1.1 Current Empirical Study


• Goal: Estimate the unbiased effect of Advanced
Placement (AP) on related college grades

• Propensity score methods may reduce bias 

• Problem: Propensity for taking AP varies across
high schools, even after conditioning on student 
characteristics 

• We are unsure how ignoring such dependence within
high schools would affect our conclusions

• Solution: Estimate multilevel propensity score
model with random high school effects
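
One common way to write such a model is a random-intercept logistic regression, sketched below. The notation is assumed for illustration and is not taken from the slides: student i in high school j, AP indicator T_ij, student covariates x_ij, coefficients β, and a high school intercept u_j with variance τ. The normal distribution for u_j is the conventional default, not a confirmed modeling choice.

```latex
\operatorname{logit}\,\Pr(T_{ij} = 1 \mid \mathbf{x}_{ij}, u_j)
  = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j,
\qquad u_j \sim N(0, \tau)
```

Setting τ = 0 recovers an ordinary single-level logistic propensity model that ignores high schools.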


1.2 Picturing the Empirical Study 
1.3 If we could design the perfect
experiment...


• We might: 

• take a cluster sampling approach to selecting a 
representative set of high schools; 

• randomly assign students of a variety of ability levels to 
take the Advanced Placement (AP) course; 

• follow all students to their college of choice and: 

• assign non-AP students to take the intro and subsequent courses; and

• assign AP students to skip the intro course and take the subsequent course.

• We hope to find that the AP group tended to
perform at least as well as the non-AP group


1.4 Choosing to Participate in AP 


• Construct a model of propensity for AP 
participation 

• Potentially important predictors of AP participation 

• Academic achievement 

• Subject area interest 

• Achievement motivation 

• Opportunities for participation 

• High school atmosphere (e.g., college-focused; pro-AP) 
2.1 Existing Research


• Griswold, Localio, and Mulrow (2010) 

• Compared: ignoring clusters; within-cluster; and multilevel match 

• Arpino and Mealli (2011) 

• Fixed cluster effects superior to either random or no effects 

• No normality assumption for cluster effects 

• VanderWeele (2008)

• Ignorability & stable unit assumptions for cluster-level treatment 

• Outside the multilevel context, see: 

• Rosenbaum & Rubin foundational propensity score theory 

• Peter Austin recent simulations & best practice 
2.2 What about College Effects?


• Ignore college effects in estimating propensity scores

• Data not cross-classified until students enter college 


Since the outcome is at the college level... 

• Only match AP- and non-AP-examinees: 

• at the same college; and 

• who took the same subsequent course. 

• Referred to as exact matching on these variables 

• Do not require that students attended the same high 
school 
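
The exact-matching restriction can be sketched as a grouping step: students are pooled by (college, subsequent course), and propensity score matching is then carried out only within each pool. The record fields (`college`, `course`, `ap`) and the function name are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def exact_match_strata(students):
    """Group student records by (college, course).

    High school is deliberately not part of the key, so matched
    students may come from different high schools.
    """
    strata = defaultdict(lambda: {"ap": [], "non_ap": []})
    for s in students:
        key = (s["college"], s["course"])
        strata[key]["ap" if s["ap"] else "non_ap"].append(s)
    # Only strata containing both groups can contribute matched pairs.
    return {k: v for k, v in strata.items() if v["ap"] and v["non_ap"]}
```

Strata with only AP or only non-AP students are dropped because no pair can be formed inside them.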
2.3 Some Notes on Propensity
Score Matching Procedure

• Greedy matching 

• As opposed to optimal matching 

• Within calipers 

• Caliper size = 0.2 * Population SD(propensity score) 

• On logistic scale 

• As opposed to probability scale; avoids scale issues 

• Use BLUP predicted propensity score? 

• Simulations will examine effects of either including or
excluding the predicted random intercept effect
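
The procedure listed above (greedy matching, within calipers of 0.2 population SDs, on the logistic scale) can be sketched as follows. This is a minimal illustration, not the study's implementation. The function names, the 1:1 without-replacement design, the order in which treated units are processed, and the pooling of both groups when computing the SD are all assumptions.

```python
import math
import statistics


def logit(p):
    """Convert a probability to the log-odds (logistic) scale."""
    return math.log(p / (1.0 - p))


def greedy_caliper_match(treated, controls, caliper_mult=0.2):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dicts mapping a unit id to its propensity
    score (a probability strictly between 0 and 1).
    Returns a list of (treated_id, control_id) pairs.
    """
    t_logit = {i: logit(p) for i, p in treated.items()}
    c_logit = {i: logit(p) for i, p in controls.items()}
    # Caliper = 0.2 * population SD of the logit propensity score,
    # pooled over both groups (the pooling is an assumption).
    pooled = list(t_logit.values()) + list(c_logit.values())
    caliper = caliper_mult * statistics.pstdev(pooled)
    available = dict(c_logit)
    pairs = []
    for t_id, t_val in t_logit.items():  # processed in the order given
        if not available:
            break
        # Greedy step: take the closest control still available.
        c_id = min(available, key=lambda c: abs(available[c] - t_val))
        if abs(available[c_id] - t_val) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs
```

Because each treated unit takes the nearest remaining control without looking ahead, the result can depend on processing order. Optimal matching would instead minimize the total distance across all pairs.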



2.4 Example w/ τ = 6, No RE


(a) No High School Random Effect in Model or Score
Replicate #59 from condition 4 simulated on 2012-03-13
(Figure axis: Standardized Difference)
2.5 Example w/ τ = 6, Model RE, Not in PS


(b) High School Random Effect in Model, but Not Score 
Replicate #59 from condition 4 simulated on 2012-03-13
2.6 Example w/ τ = 6, Model RE, Inc. in PS


(c) High School Random Effect in Model and Score 



2.7 Comments on Example Replicate 


• When including HS random effects: 

• Propensity score d is much larger before matching

• Better balance on gender & race after matching 

• Aside from that, either picture looks pretty good: 

• Approximate balance after matching. 

• Non-negative course grade d. 

• The problem with ignoring random effects is a 
violation of ignorability 

• Without HS, AP participation is not MAR.
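
The balance statistic d discussed above is a standardized difference between the AP and non-AP groups. The sketch below uses a common convention in which the denominator pools the two group variances equally. The slides do not state which denominator was used, so this choice and the function name are assumptions.

```python
import math
import statistics


def standardized_difference(treated, control):
    """Standardized mean difference d = (mean_t - mean_c) / pooled SD.

    The pooled SD is the square root of the average of the two
    sample variances (an assumed, commonly used convention).
    """
    var_t = statistics.variance(treated)
    var_c = statistics.variance(control)
    pooled_sd = math.sqrt((var_t + var_c) / 2.0)
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd
```

Computed on a covariate before and after matching, d near zero after matching indicates approximate balance. Computed on course grades after matching, it gives the outcome effect size.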
2.9 Course Grade d's after Matching
1,000 replicates from condition 1 simulated on 2012-03-13
(Figure axis: Proportion of Replicates)


2.10 Course Grade d's after Matching
1,000 replicates from condition 2 simulated on 2012-03-13
1,000 replicates from condition 4 simulated on 2012-03-13
1,000 replicates from condition 5 simulated on 2012-03-13
1,000 replicates from condition 6 simulated on 2012-03-13
1,000 replicates from condition 7 simulated on 2012-03-13


2.16 Average Course Grade Stats 
After Matching by Condition & PS Model 
2.17 Simulation Results w/ respect to τ


• As random HS intercept variance (τ) increases...

• number of within-caliper matches made increases; 

• ignoring HS RE -> mean recovered d is stable;

• modeling HS random effects: 

• PS excludes HS RE -> similar to ignoring HS; and 

• PS includes HS RE 

• mean recovered d decreases; and 

• variance in recovered course grade d increases. 
2.18 Other Simulation Results


• When ignoring HS or excluding it from the propensity score:

• Propensity score SD, and therefore caliper size, is smaller

• More matches result

• Are these better matches than when modeling the HS random
effect and including it in the propensity score?


• How to optimize both the quality of matches and
sample size?
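
The link between the high school random effect and caliper size can be illustrated with a small simulation. The sketch below generates true logit propensity scores with a random high school intercept and reports how their SD, and so a 0.2-SD caliper, changes with the intercept variance. All parameter values, the single student covariate, and the function name are hypothetical and do not reproduce the study's simulation conditions. The sketch also uses the true random intercepts, whereas BLUP-predicted intercepts are shrunken toward zero.

```python
import math
import random
import statistics


def simulate_logit_sd(tau, n_schools=50, n_students=40,
                      beta0=-1.0, beta1=1.0, seed=1):
    """Return the population SD of the true logit propensity score
    when the random high school intercept has variance tau."""
    rng = random.Random(seed)
    logits = []
    for _ in range(n_schools):
        # School-level random intercept, drawn once per high school.
        u_j = rng.gauss(0.0, math.sqrt(tau))
        for _ in range(n_students):
            x = rng.gauss(0.0, 1.0)  # one standardized student covariate
            logits.append(beta0 + beta1 * x + u_j)
    return statistics.pstdev(logits)


for tau in (0.0, 0.5, 1.0):
    sd = simulate_logit_sd(tau)
    print(f"tau={tau:.1f}  SD(logit)={sd:.3f}  caliper={0.2 * sd:.3f}")
```

With the random intercept left out of the score (tau = 0 here), the logit SD reflects only the student covariate, so the caliper is narrower than when the high school effect is included.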


CollegeBoard 


3.1 Tying Simulations back to Application 


• A Placement Validity Study for Advanced 
Placement® Exam Scores 

• Forthcoming study with my colleague Maureen Ewing 

• 2006 cohort of first-time, first-year college students 

• Used official AP credit / placement granting policies 

• Needed sufficient number of AP examinees taking 
subsequent courses 

• Needed a good propensity score model and to achieve 
balance 

• Final sample: 10 exams; < 53 colleges


3.2 Mean Course Grades, after Matching 
3.3 Summary of Results


• AP participation differs across high schools 

• AP examinees significantly outperformed 
matched non-AP counterparts in five AP exams 

• Calculus AB, Calculus BC, Chemistry, Physics C: 
Mechanics, and United States Government and Politics 


• In the remaining five exams, no significant 
differences existed for course grades 


• Biology, Microeconomics, Macroeconomics, 
Psychology, and U.S. History 


• Criterion differences? Differential selection?


3.4 Questions, Comments, Suggestions? 


• Researchers are encouraged to freely express their 
professional judgment. Therefore, points of view or 
opinions stated in College Board presentations do not 
necessarily represent official College Board position or 
policy. 


• Please forward any questions, comments, and 
suggestions to: 

• bpatterson@collegeboard.org


And check out Research & Development’s site: 
• http://www.collegeboard.com/research